Abstract 

Profile likelihood intervals of large quantiles in Extreme Value distributions provide 
a good way to estimate these parameters of interest since they take into account the 
asymmetry of the likelihood surface in the case of small and moderate sample sizes; 
however they are seldom used in practice. In contrast, maximum likelihood asymptotic 
(mla) intervals are commonly used without respect to sample size. It is shown here 
that profile likelihood intervals actually are a good alternative for the estimation of 
quantiles for sample sizes $25 \le n \le 100$ of block maxima, since they presented adequate 
coverage frequencies in contrast to the poor coverage frequencies of mla intervals for 
these sample sizes, which also tended to underestimate the quantile and therefore might 
be a dangerous statistical practice. 

In addition, maximum likelihood estimation can present problems when Weibull 
models are considered for moderate or small sample sizes due to singularities of the 
corresponding density function when the shape parameter is smaller than one. These 
estimation problems can be traced to the commonly used continuous approximation 
to the likelihood function and could be avoided by using the exact or correct likeli- 
hood function, at least for the settings considered here. A rainfall data example is 
presented to exemplify the suggested inferential procedure based on the analyses of 
profile likelihoods. 
1 Introduction 

According to the Fisher-Tippett theorem [2], only three families of distributions are the limits 
for the distribution of normalized maxima of i.i.d. random variables: Weibull, Gumbel, and 
Frechet. These three families of Extreme Value distributions (EV) are submodels of a single 
family of distributions proposed independently by Von Mises [7] and Jenkinson [3] which is 
now known as the Generalized Extreme Value distribution (GEV). 



Usually large quantiles $Q_\alpha$ of probability $\alpha$ of these distributions are of interest. Different 
confidence intervals for these quantiles can be obtained depending on the model used, the 
GEV or a specific subfamily of models: Frechet, Gumbel or Weibull. Under the selected 
model, the usual procedure is to obtain asymptotic maximum likelihood (ami) confidence 
intervals, which are symmetric about the maximum likelihood estimate (mle). These intervals 
usually do not take into account the commonly marked asymmetry of the likelihood surface 
of large quantiles in the case of small or moderate samples, and thus tend to underestimate 
the true value of the quantile. 

Profile likelihood intervals for quantiles have not been fully explored in statistical liter- 
ature for Extreme Value Theory and neither have their coverage properties in the cases of 
small and moderate samples. In this work, the coverage frequencies and lengths of likeli- 
hood intervals for quantiles are explored and compared to those of ami confidence intervals 
through a simulation study. 

In addition, the profile likelihood intervals for the shape parameter of the GEV were 
also considered and shown to have good coverage frequencies. These intervals are of special 
importance since they can be used as an aid for submodel selection. 

The use of the exact likelihood function, described in the following section, is recom- 
mended for the case of small sample sizes where a Weibull model might be reasonable, in 
order to avoid maximum likelihood estimation problems due to singularities of the corre- 
sponding density function. 

As an example, a data set of yearly rain maxima collected at a monitoring station in 
Michoacan, Mexico is presented to exemplify the likelihood based estimation procedures. 

2 Relevant Related Statistical Concepts 



The relative and profile or maximized likelihood functions of a parameter of interest will be 
presented here. In addition, the exact or correct likelihood function is defined as well. These 
functions contribute to simplify and improve the estimation of parameters of interest such 
as quantiles of Extreme Value distributions. Also, expressions for the probability densities 
and distribution functions of all the models involved are here provided, as well as for their 
corresponding quantiles, which are the main parameters of interest. 
The densities of the three EV families for maxima are 



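$$\text{Gumbel:}\quad \lambda(x; \mu, \sigma) = \frac{1}{\sigma} \exp\left\{ -\exp\left( -\frac{x-\mu}{\sigma} \right) - \frac{x-\mu}{\sigma} \right\} I_{(-\infty,\infty)}(x), \qquad (1)$$

$$\text{Frechet:}\quad \varphi(x; \mu, \sigma, \beta) = \frac{\beta}{\sigma} \left( \frac{x-\mu}{\sigma} \right)^{-\beta-1} \exp\left\{ -\left( \frac{x-\mu}{\sigma} \right)^{-\beta} \right\} I_{[\mu,\infty)}(x), \text{ and} \qquad (2)$$

$$\text{Weibull:}\quad \psi(x; \mu, \sigma, \beta) = \frac{\beta}{\sigma} \left( \frac{\mu-x}{\sigma} \right)^{\beta-1} \exp\left\{ -\left( \frac{\mu-x}{\sigma} \right)^{\beta} \right\} I_{(-\infty,\mu]}(x), \qquad (3)$$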



with location, scale and shape parameters $\mu \in \mathbb{R}$, $\sigma > 0$ and $\beta > 0$, respectively. For the 
Weibull and Frechet densities, $\mu$ is also a threshold parameter, since it represents an upper 
or lower bound, respectively, for the support of the corresponding random variable. Note 
that for $\beta < 1$, the Weibull density has a singularity at $x = \mu$. 



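The Generalized Extreme Value distribution (GEV) density function is 

$$g(z; a, b, c) = \begin{cases} \dfrac{1}{b} \left[ 1 + c\,\dfrac{z-a}{b} \right]^{-1/c-1} \exp\left\{ -\left[ 1 + c\,\dfrac{z-a}{b} \right]^{-1/c} \right\}, & c \neq 0,\ 1 + c\,\dfrac{z-a}{b} > 0, \\[2ex] \dfrac{1}{b} \exp\left\{ -\dfrac{z-a}{b} - \exp\left( -\dfrac{z-a}{b} \right) \right\}, & c = 0, \end{cases} \qquad (4)$$

where a, b, c are location, scale and shape parameters, respectively, with $b > 0$ and $a, c \in \mathbb{R}$. The 
GEV corresponds to the Weibull, Gumbel, or Frechet distributions according to whether c is 
negative, zero, or positive, respectively. Note that the expression given for c = 0 is the limit 
of g(z; a, b, c) when c tends to zero. The parameters of the EV models and the corresponding 
GEV are connected through a one to one relationship given in Table 1. 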
Table 1. Parameters for the EV and GEV distributions 

In the case of the Weibull and Frechet models for maxima, the threshold is isolated in 
a single parameter $\mu$ that may have a clear physical interpretation. Inferences in terms of 
estimation intervals for this parameter are simpler with an EV distribution in contrast to 
the corresponding threshold for the GEV, which is a function of all three parameters a, b, c. 
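
The one to one relationship between the EV and GEV parameters can be recovered by matching distribution functions. The following Python sketch assumes the standard GEV distribution function $\exp\{-[1 + c(z-a)/b]^{-1/c}\}$ for $c \neq 0$ together with the Frechet and Weibull models (2) and (3); the function names are illustrative and the mapping is derived here rather than copied from Table 1.

```python
import math


def frechet_to_gev(mu, sigma, beta):
    """Map Frechet (mu, sigma, beta) to GEV (a, b, c), with c > 0."""
    return mu + sigma, sigma / beta, 1.0 / beta


def weibull_to_gev(mu, sigma, beta):
    """Map Weibull-for-maxima (mu, sigma, beta) to GEV (a, b, c), with c < 0."""
    return mu - sigma, sigma / beta, -1.0 / beta


def gev_threshold(a, b, c):
    """Bound of the GEV support for c != 0; it involves all three parameters."""
    return a - b / c


def gev_cdf(z, a, b, c):
    """GEV distribution function for c != 0 (assumed standard form)."""
    t = 1.0 + c * (z - a) / b
    if t <= 0.0:
        # Below the lower bound (c > 0) or above the upper bound (c < 0).
        return 0.0 if c > 0 else 1.0
    return math.exp(-t ** (-1.0 / c))


def frechet_cdf(x, mu, sigma, beta):
    return math.exp(-((x - mu) / sigma) ** (-beta)) if x > mu else 0.0


def weibull_cdf(x, mu, sigma, beta):
    return math.exp(-((mu - x) / sigma) ** beta) if x < mu else 1.0


# Hypothetical parameter values, used only to check that both forms agree.
a, b, c = frechet_to_gev(2.0, 3.0, 4.0)
print(gev_cdf(6.0, a, b, c), frechet_cdf(6.0, 2.0, 3.0, 4.0))
print(gev_threshold(a, b, c))  # recovers the Frechet threshold mu
```

Under this mapping the threshold equals a - b/c in both the Weibull and the Frechet case, which illustrates why interval estimation for it is less direct under the GEV parametrization.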

It is important to note that there exist Weibull and Frechet models that are very close 
and practically indistinguishable from a Gumbel model. That is, the Gumbel distribution is 
a limit of Weibull distributions with parameters related as shown in Table 1. The Gumbel 
model is embedded in the Weibull family of models in this sense, as well as in the Frechet 
family (Cheng and Iles [1]). 

All these models can be parametrized in terms of a quantile of interest by direct algebraic 
substitution in (1), (2) and (3), since any quantile can be expressed as a function of the other 
parameters as shown in Table 2. Therefore, the model can be expressed in terms of the 
quantile of interest, which substitutes one of the remaining parameters. For example, the 
Weibull model can be reparametrized in terms of $(Q_\alpha, \sigma, \beta)$ instead of $(\mu, \sigma, \beta)$. 
Table 2. Quantiles for the EV and GEV distributions. 
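
As an illustration of how a quantile is expressed through the other parameters, the following Python sketch inverts the distribution functions of the four models. The expressions are derived here by solving $F(Q_\alpha) = \alpha$ under the standard forms of these distributions, so they are an assumption about, not a transcription of, Table 2; the function names are hypothetical.

```python
import math


def gumbel_quantile(alpha, mu, sigma):
    return mu - sigma * math.log(-math.log(alpha))


def frechet_quantile(alpha, mu, sigma, beta):
    return mu + sigma * (-math.log(alpha)) ** (-1.0 / beta)


def weibull_quantile(alpha, mu, sigma, beta):
    return mu - sigma * (-math.log(alpha)) ** (1.0 / beta)


def gev_quantile(alpha, a, b, c):
    y = -math.log(alpha)
    if c == 0.0:
        return a - b * math.log(y)            # Gumbel limit
    return a + (b / c) * (y ** (-c) - 1.0)


# Reparametrization example: given Q_alpha, sigma and beta, the Weibull
# threshold mu is recovered by solving the quantile expression for mu.
def weibull_mu_from_quantile(q, alpha, sigma, beta):
    return q + sigma * (-math.log(alpha)) ** (1.0 / beta)


q95 = weibull_quantile(0.95, 3.0, 2.0, 2.0)    # hypothetical parameter values
print(q95, weibull_mu_from_quantile(q95, 0.95, 2.0, 2.0))
```

The last function shows the substitution used for the reparametrization: once $Q_\alpha$ is treated as a parameter, $\mu$ becomes a function of $(Q_\alpha, \sigma, \beta)$.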

The asymptotic properties of maximum likelihood estimators are invoked in order to 
obtain confidence intervals for the parameters of interest. Usually the continuous approximation 
to the likelihood function as defined in Kalbfleisch [4] is the one used in most 
statistical textbooks to define the likelihood function for continuous random variables, without 
taking notice that it is an approximation. For an observed sample of n independent 
and identically distributed continuous random variables, the continuous approximation to the 
likelihood function is 

$$L(\theta; x) \propto \prod_{i=1}^{n} f(x_i; \theta), \qquad (5)$$

where $\theta$ is the vector of parameters, and $f$ is the density function of the selected model. 

This continuous approximation to the likelihood is only valid if the density functions do 
not have singularities (see Montoya et al [6]). For example, for a given observed sample, 
the joint Weibull density has a singularity when the threshold parameter equals the largest 
observation, $\mu = x_{(n)}$, if the shape parameter $\beta$ is smaller than one, $\beta < 1$. 

However, the data are always discrete since all measuring instruments have finite precision. 
Therefore, the data can only be recorded to a finite number of decimals. Thus the 
observation $X = x$ can be interpreted as $x - \frac{1}{2}h \le X \le x + \frac{1}{2}h$, where h is the precision of 
the measuring instrument, and so is a fixed positive number. For independent observations 
$x = (x_1, \ldots, x_n)$, the exact or correct likelihood function $L_E$ is defined to be proportional 
to the joint probability of the sample, 

$$L_E(\theta; x) \propto \prod_{i=1}^{n} P\left( x_i - \tfrac{1}{2}h \le X_i \le x_i + \tfrac{1}{2}h; \theta \right) = \prod_{i=1}^{n} \left[ F\left( x_i + \tfrac{1}{2}h; \theta \right) - F\left( x_i - \tfrac{1}{2}h; \theta \right) \right], \qquad (6)$$

where F is the corresponding distribution function of the continuous model in consideration. 

Allowing h = 0 implies that the measuring instrument has infinite precision and that 
the observations can be recorded to an infinite number of decimals. Since for a continuous 
random variable X, $P(X = x; \theta) = 0$ for all x and $\theta$, this cannot be the basis for obtaining 
a likelihood function. If in contrast, one assumes that the precision of the measuring instrument 
is h > 0, then conditions are required for the density function $f(y; \theta)$ to be used 
as an approximation to the likelihood function (6), as required by the Mean Value Theorem 
for integrals. But if the density function has a singularity at any given value of $\theta$, 
then these conditions are violated and $f(y; \theta)$ cannot be used to approximate the likelihood 
function at that value of $\theta$ ([4], Section 9.4). 
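
The step in which the density replaces the interval probability can be written out for a single observation. This is a sketch, assuming that $f(\cdot; \theta)$ is continuous on the closed interval of length h around x:

$$P\left( x - \tfrac{1}{2}h \le X \le x + \tfrac{1}{2}h; \theta \right) = \int_{x-h/2}^{x+h/2} f(y; \theta)\, dy = h\, f(x'; \theta) \quad \text{for some } x' \in \left[ x - \tfrac{1}{2}h,\ x + \tfrac{1}{2}h \right].$$

When f varies little over this interval, $f(x'; \theta) \approx f(x; \theta)$, and since h does not depend on $\theta$ the probability is approximately proportional to $f(x; \theta)$. If f is unbounded on the interval for some value of $\theta$, the intermediate point $x'$ is no longer guaranteed to exist and $f(x; \theta)$ can be arbitrarily far from the interval probability divided by h.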

As Meeker and Escobar ([5], p. 275) mention, there is a path in the parameter space 
for which the continuous approximation to the likelihood (5) goes to infinity, in particular 
for the Weibull case, when $\beta < 1$ and $\mu \to x_{(n)}$. It should be stressed that the likelihood 
approaches infinity not necessarily because the probability of the data is large in that region 
of the parameter space, but instead because of a breakdown in the density approximation to 
the likelihood function. There is usually, as happened with all simulations considered here, 
though not necessarily always, a local maximum for this likelihood surface corresponding to 
the maximum of the exact likelihood based on the probability of the data shown in (6). 
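
The breakdown described above can be seen numerically. The following Python sketch evaluates the density-based log-likelihood and the interval-probability log-likelihood for a Weibull model for maxima as the threshold approaches the largest observation. The data values, the precision h = 0.1 and the fixed values of the scale and shape are hypothetical choices made only for illustration.

```python
import math


def weibull_logpdf(x, mu, sigma, beta):
    """Log-density of the Weibull model for maxima, valid for x < mu."""
    u = (mu - x) / sigma
    return math.log(beta / sigma) + (beta - 1.0) * math.log(u) - u ** beta


def weibull_cdf(x, mu, sigma, beta):
    if x >= mu:
        return 1.0
    return math.exp(-((mu - x) / sigma) ** beta)


def density_loglik(xs, mu, sigma, beta):
    """Continuous approximation: sum of log-densities."""
    return sum(weibull_logpdf(x, mu, sigma, beta) for x in xs)


def exact_loglik(xs, mu, sigma, beta, h):
    """Exact likelihood: log-probability of the recorded intervals of width h."""
    total = 0.0
    for x in xs:
        p = (weibull_cdf(x + h / 2.0, mu, sigma, beta)
             - weibull_cdf(x - h / 2.0, mu, sigma, beta))
        total += math.log(p)
    return total


xs = [1.2, 2.7, 3.1, 3.9, 4.4]   # hypothetical maxima recorded to one decimal
h, sigma, beta = 0.1, 1.0, 0.5   # shape smaller than one: density is singular at mu

for eps in (1e-2, 1e-4, 1e-8):
    mu = max(xs) + eps           # threshold approaching the largest observation
    print(eps, density_loglik(xs, mu, sigma, beta), exact_loglik(xs, mu, sigma, beta, h))
```

In this example the density-based value keeps increasing as eps shrinks, while the interval-based value settles, because each factor of the exact likelihood is a probability and therefore cannot exceed one.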

A useful standardized version of a likelihood function $L(\theta; x)$ that will be used here is 
the relative likelihood function, which has a value of 1 at its maximum, the mle $\hat{\theta}$, and is 
defined as 

$$R(\theta; x) = \frac{L(\theta; x)}{L(\hat{\theta}; x)}, \qquad (7)$$

so that $0 \le R(\theta; x) \le 1$. Values of $\theta$ with $R(\theta; x)$ close to one are more plausible than values 
close to zero. A relative likelihood is easy to plot and to interpret. Likelihood intervals or 
regions of likelihood level k are obtained by cutting this likelihood function horizontally; 
that is, 

$$\{\theta : R(\theta; x) \ge k\}, \qquad 0 \le k \le 1. \qquad (8)$$

For example, if k = 0.15, under some regularity conditions, the corresponding likelihood 
interval has an asymptotic approximate 95% confidence level, using the Chi-square limit 
distribution for the likelihood ratio statistic ([4], Section 11.3). However this result may also 
hold for moderate samples, and even small samples, if the likelihood surface is symmetric 
about the mle. In these cases the interval in (8) is called a likelihood-confidence interval. 
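
The correspondence between the likelihood level k and the asymptotic confidence level follows from the chi-square limit with one degree of freedom for $-2 \log R$. A minimal Python sketch of this arithmetic, using only the standard library and with illustrative function names, is:

```python
import math
from statistics import NormalDist


def likelihood_level(confidence):
    """Level k whose likelihood interval has the given asymptotic confidence.

    Uses the fact that a chi-square variable with one degree of freedom is the
    square of a standard normal, so its quantile is a squared normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.exp(-0.5 * z * z)


def confidence_level(k):
    """Asymptotic confidence of the k-level interval: P(chi2_1 <= -2 log k)."""
    return math.erf(math.sqrt(-math.log(k)))


print(likelihood_level(0.95))   # about 0.147
print(confidence_level(0.15))   # about 0.949
```

The level 0.15 is thus a rounded version of the exact value for 95% confidence, which is why the resulting confidence level is described as approximate.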

If the GEV model is parametrized in terms of a quantile of interest, then the profile or 
maximized likelihood function of $Q_\alpha$ (Kalbfleisch, 1985, Section 10.3) is defined for sample 
x as 

$$L_p(Q_\alpha; x) = \max_{b,\, c \,\mid\, Q_\alpha} L(Q_\alpha, b, c; x).$$

The corresponding relative likelihood can be calculated as in (7). Profile relative likelihoods 
and their plots are very informative about plausible ranges for the parameter of interest, in 
the light of the observed sample. 
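
The construction of a profile likelihood interval can be sketched in a few lines of Python. To keep the example short and dependency-free it uses the two-parameter Gumbel submodel rather than the GEV, so that only the scale has to be maximized out for each fixed value of the quantile. The golden-section search, the search ranges and the simulated sample are illustrative choices and are not taken from the procedure used in this paper.

```python
import math
import random

GOLD = (math.sqrt(5.0) - 1.0) / 2.0  # golden-ratio step, about 0.618


def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search: maximizer and maximum of a unimodal f on [lo, hi]."""
    c, d = hi - GOLD * (hi - lo), lo + GOLD * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:                       # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - GOLD * (hi - lo)
            fc = f(c)
        else:                             # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + GOLD * (hi - lo)
            fd = f(d)
    x = 0.5 * (lo + hi)
    return x, f(x)


def gumbel_loglik(xs, mu, sigma):
    """Gumbel log-likelihood (continuous approximation)."""
    total = 0.0
    for x in xs:
        z = (x - mu) / sigma
        if z < -700.0:                    # exp(-z) would overflow; likelihood is ~0
            return -math.inf
        total += -math.log(sigma) - z - math.exp(-z)
    return total


def profile_loglik(xs, q, alpha, s_range):
    """Profile log-likelihood of Q_alpha: maximize over log(sigma) with q fixed."""
    shift = math.log(-math.log(alpha))    # Q_alpha = mu - sigma * shift
    return golden_max(
        lambda s: gumbel_loglik(xs, q + math.exp(s) * shift, math.exp(s)),
        *s_range)[1]


def profile_interval(xs, alpha=0.95, k=0.15):
    """Return (lower, q_hat, upper) of the k-level likelihood interval for Q_alpha."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    s_range = (math.log(sd) - 4.0, math.log(sd) + 3.0)   # range for log(sigma)
    lp = lambda q: profile_loglik(xs, q, alpha, s_range)
    q_lo, q_hi = min(xs), max(xs) + 10.0 * sd
    q_hat, l_max = golden_max(lp, q_lo, q_hi, tol=1e-6)
    target = l_max + math.log(k)          # log-likelihood at relative level k

    def crossing(inside, outside):
        # Bisection between a point with R >= k and a point with R < k.
        for _ in range(60):
            mid = 0.5 * (inside + outside)
            if lp(mid) >= target:
                inside = mid
            else:
                outside = mid
        return inside

    return crossing(q_hat, q_lo - 10.0 * sd), q_hat, crossing(q_hat, q_hi + 10.0 * sd)


rng = random.Random(1)
# Hypothetical sample: 50 Gumbel maxima with mu = 0, sigma = 1 (inverse transform).
sample = [-math.log(-math.log(rng.random())) for _ in range(50)]
lower, q_hat, upper = profile_interval(sample)
print(lower, q_hat, upper)
```

Comparing `upper - q_hat` with `q_hat - lower` gives a quick numerical check of the asymmetry of the profile likelihood that a symmetric interval would ignore.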

In the case of the profile likelihood of the GEV shape parameter c, the relative likelihood 
at c = 0 is indicative of the support given by the sample to the Gumbel model, which 
corresponds to c = 0. For example if $R_p(c = 0) > 0.5$, the Gumbel model has moderate or 
high plausibility and should definitely be considered as a possible model; its fit to the sample 
should be compared with the fit of the best member of the family of EV models suggested 
by the sign and value of the mle $\hat{c}$. 

Summarizing, in order to make inferences about a parameter of interest, for example a 
quantile, the corresponding plot of the relative profile likelihood should be analyzed because 
it is very informative. Inferences about the parameter of interest should be presented in terms 
of likelihood-confidence intervals, especially in the case of small or moderate samples. These 
intervals, calculated for two large quantiles, Q.95 and Q.99, and for the GEV shape parameter c, 
were shown through simulations, reported in the following sections, to have adequate coverage 
frequencies for moderate sample sizes ($n \ge 50$), and even for n = 25 in the case of Gumbel 
and Frechet models. 

3 Simulations 

For the simulation study, the samples of maxima were chosen to come from one of the EV 
distributions, (or equivalently a GEV distribution) and not from a distribution belonging to 
the domain of attraction of an EV. Samples were simulated from the GEV with parameters 
a = 1, b = 1 and 

$c \in \{-0.5, -0.4, -0.3, -0.2, -0.1, -0.05, 0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5\}$, 



for sample sizes of n = 25 and 50. Additional values of c, ±0.01 and ±0.001 were considered 
as well as the previous ones, for n = 100 in order to explore the cases around c = 0. These 
cases are such that there are models from the three subfamilies of EV that are very close to 
each other. 

Size 50 is frequently found in samples coming from meteorological applications, and 
sample size 100 was chosen to explore the effect of increasing sample size. For each value of 
c and sample size, 10,000 samples were generated in Matlab 7. 
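
The sampling step can be sketched with the inverse transform method. The simulations reported here were run in Matlab; the Python version below is only an illustration, assumes the standard GEV distribution function with positive c for the Frechet case, and uses hypothetical names and seed.

```python
import math
import random


def gev_sample(n, a, b, c, rng):
    """Draw n values from GEV(a, b, c) by inverting the distribution function."""
    out = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:                   # avoid log(0); practically never happens
            u = rng.random()
        y = -math.log(u)                  # standard exponential variate
        if c == 0.0:
            out.append(a - b * math.log(y))
        else:
            out.append(a + (b / c) * (y ** (-c) - 1.0))
    return out


C_GRID = [-0.5, -0.4, -0.3, -0.2, -0.1, -0.05, 0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
rng = random.Random(2024)
# One sample of size 50 per shape value, with a = b = 1 as in the study design.
samples = {c: gev_sample(50, 1.0, 1.0, c, rng) for c in C_GRID}
print({c: round(max(xs), 2) for c, xs in samples.items()})
```

For negative c every simulated value stays below the upper bound a - b/c, which is the bounded right tail that matters when large quantiles are estimated.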

For each sample of maxima, the mle's of the parameters (a, b, c) of the GEV distribution 
were calculated using the continuous approximation to the likelihood function. This is the 
current procedure in Extreme Value literature. The cases where the singularities of this 
density caused numerical problems for finding the local maximum (the mle) were registered 
and the exact likelihood function was used then to obtain the mle's. 

For each simulated sample, the corresponding EV model was selected automatically according 
to whether $\hat{c} < -10^{-5}$ (Weibull), $|\hat{c}| \le 10^{-5}$ (Gumbel) or $\hat{c} > 10^{-5}$ (Frechet). The mle's of the 
corresponding parameters were obtained by maximizing the likelihood derived from (1), (2) 
or (3), accordingly, reparametrized in terms of the quantile of interest, which worked well in 
most of the cases. Only when $\hat{c} < -1$ and $\hat{\beta} < 1$ was it necessary to use the corresponding 
exact Weibull likelihood function, as mentioned above. These cases were registered, since 
they represent cases where the continuous approximation to the likelihood function would 
not have been able to produce an mle with these EV distributions. 
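
The automatic selection rule amounts to a three-way comparison of the estimated shape with a small tolerance. A minimal Python sketch, with a hypothetical function name and the tolerance stated above as default, is:

```python
def select_submodel(c_hat, tol=1e-5):
    """Choose the EV submodel from the sign and size of the estimated GEV shape."""
    if c_hat < -tol:
        return "Weibull"
    if c_hat > tol:
        return "Frechet"
    return "Gumbel"


for c_hat in (-0.3, 2e-6, 0.21):   # hypothetical estimates
    print(c_hat, select_submodel(c_hat))
```

Because the rule uses only the point estimate, a sample from a Frechet model with small positive c can easily be assigned to the Weibull family, which is the situation discussed in the results.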

Using the invariance property of the likelihood function, the mle's of quantiles Q.95 and 
Q.99 can be obtained from the mle's of the parameters of the EV or GEV, though they 
were obtained directly from the corresponding likelihood function parametrized in terms 
of these quantiles. From their corresponding relative likelihoods, 15% likelihood intervals 
were obtained for c, Q.95, and Q.99. As mentioned above, these intervals may have an 
approximate 95% confidence level in the case of moderate sample sizes, using the Chi-square 
limit distribution for the likelihood ratio statistic ([4], Section 11.3). For each of these intervals 
it was checked whether it included the true value of the corresponding parameter in order 
to calculate the associated coverage frequency. For those intervals that excluded the true 
value of the parameter of interest, the number of times that the interval underestimated or 
overestimated was registered. Also the lengths of the intervals that covered the true value of 
the parameter were registered and compared as shown in the following section. In addition, 
the asymptotic maximum likelihood (ami) confidence intervals were obtained for Q.95 and 
Q.99 and their coverage frequencies were registered. 
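
The bookkeeping described in this paragraph can be sketched as follows in Python. The function name, the returned field names and the example intervals are hypothetical; an interval counts as underestimating when it lies entirely below the true value and as overestimating when it lies entirely above it.

```python
def coverage_summary(intervals, true_value):
    """Tally coverage, under- and overestimation for a list of (lower, upper) pairs."""
    below = sum(1 for lo, hi in intervals if hi < true_value)   # underestimates
    above = sum(1 for lo, hi in intervals if lo > true_value)   # overestimates
    covered = len(intervals) - below - above
    lengths = [hi - lo for lo, hi in intervals if lo <= true_value <= hi]
    return {
        "coverage": covered / len(intervals),
        "below": below,
        "above": above,
        "lengths_of_covering": lengths,
    }


# Hypothetical intervals for a parameter whose true value is 2.0.
demo = [(1.0, 3.0), (2.5, 4.0), (0.5, 1.5), (3.2, 5.0)]
print(coverage_summary(demo, 2.0))
```

Keeping the two kinds of misses separate is what allows the asymmetry between underestimation and overestimation to be reported alongside the coverage frequency.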

4 Results 

Tables 3 and 4 present the coverage frequencies for Q.95 and Q.99 of 15% relative profile 
likelihood intervals and their corresponding ami intervals in the case of samples of size 
n = 25, 50, and 100. Asymptotically these 15% likelihood intervals should have 95% coverage 
frequencies. Table 5 gives the coverage frequencies of 15% relative profile likelihood intervals 
for the parameter c of the GEV model for samples of size 100 and 50. The last two columns 
of this table report for each scenario the number of samples that selected the correct EV 
model according to the sign of the mle $\hat{c}$ and the number of samples where the product of the 
interval endpoints was negative. These are cases where the three EV models are plausible, 
since the value c = 0 is included in the interval. 

Figure 1 shows the coverage frequencies of the quantiles of interest contained in Tables 
3 to 5 in a graphical way. Figures 2 and 3 show the ratios of the lengths of the relative 
profile likelihood intervals under the selected EV model compared to those under the GEV 
model, and Figures 4 and 5 give the length of profile likelihood intervals for the GEV using 
boxplots in which the box corresponds to the interquartile range and the whiskers have a 
maximum length of 1.5 times the interquartile range. Points beyond the end of the whiskers 
are represented individually and the line inside the box is the median. Only samples for 
which all intervals covered the true value of the quantile were considered in these graphs. 

Some remarks about the tables and figures are given below. Note that EV submodels 
are selected automatically, based only on the sign and size of c, so the reported coverage 
frequencies correspond to a 'worst case' scenario. With a real data set, additional external 
information from experts would be taken into account for choosing an adequate submodel, 
and consequently the statistical modeling would be more efficient. 

1. Coverage frequencies of GEV profile likelihood intervals and number of 
samples with estimation problems. Coverage frequencies of relative profile like- 
lihood intervals for the GEV were very stable throughout the range of values of c for 
both quantiles. They tend to decrease as c moves towards more negative values. For 
n = 100 there were no numerical problems when calculating the mle's. For n = 50 the 
number of samples with numerical problems was insignificant. However for n = 25, 
more samples presented problems in the case of Weibull models with values of c smaller 
than -0.2. The number of problematic cases grows as c goes to -0.5 and is above 1.8% 
for c = -0.4 and above 5% for c = -0.5. The number of samples that had numerical 
problems was the same for both quantiles considered. Therefore, numerical problems 
are associated with small sample sizes and Weibull models with large negative values of 
c. 

2. Coverage frequencies of EV profile likelihood intervals. Coverage frequencies 
of relative profile likelihood function intervals for the EV were not so stable, and in 
all cases there is a region of decrease, mainly in the Frechet domain, where frequencies 
drop, as shown in Figure 1. This region grows wider as the sample size gets smaller, 
and the value where the minimum occurs shifts to the right from around 0.1 for n = 100 
to around 0.2 for n = 25. The drop is always more pronounced for Q.99 than for Q.95. 
This can be explained by the fact that for the samples that did not cover the true value 
of the quantile, the mle $\hat{c}$ was negative in most cases and the whole interval lay below 
this true value and therefore underestimated it (see the second columns in Tables 3 and 
4). In the Frechet cases, these problems were associated with estimating a large Frechet 
quantile with a Weibull model that has a bounded right tail. 

3. Coverage frequencies of ami intervals. Ami intervals always had poorer coverage 
frequencies than relative profile likelihood intervals for the GEV for all the sample 
sizes considered here. Coverage frequencies for ami intervals calculated for the GEV 
and EV distributions are almost identical. Although coverage frequencies for these 
intervals improve as the sample size grows, as predicted by asymptotic theory, they 
can be very poor for n = 25 and 50, and still unsatisfactory even for n = 100. This 
indicates that samples of greater size are required for these intervals to have suitable 
coverage frequencies. In all cases the intervals that failed to cover the true values 
tended to underestimate them. 

4. Asymmetry of proportions of intervals that exclude the true value. Except for 
one single case (n = 50, Q.99, c = 0.5) there were always more relative profile likelihood 
intervals that underestimated than overestimated the true value of the quantile. This 
asymmetry is more pronounced for smaller sample sizes, n = 25. The asymmetry also 
increases as c becomes smaller and is very marked in the Weibull case. This may be 
due to the fact that the Weibull distribution has a finite upper limit and intervals 
tend to increase in size with c. Therefore estimating a large quantile from a sample 
with $\hat{c} \ll c$ will tend to underestimate the true value, while in the case $\hat{c} \gg c$ the 
interval will be larger and more likely to include the true value. However, even if this 
asymmetry is not desirable, the asymmetry of ami intervals is certainly much more 
marked than the one for profile likelihood intervals. 

5. Interval lengths. Intervals obtained with the GEV model are almost always longer 
than those obtained with EV distributions, as shown in Figures 2 and 3. Only samples 
where both intervals included the true value of the parameter were considered. The 
lengths of the intervals tended to be alike for large values of |c|, although there is some 
asymmetry in this, with Frechet intervals being closer in length than the corresponding 
Weibull cases. Also, the ratio of lengths is closer to one for Q.95 than for Q.99. For 
both quantiles the largest difference occurs at c = -0.05 for n = 100 and 50, and at 
c = -0.1 for n = 25. In Figures 2 and 3, the region where the interquartile boxes are 
visible (i.e. where the length differences are more important) coincides roughly with 
the region where there is a drop in the coverage frequencies for the EV distributions. 
This shows that there is a trade-off between coverage and precision in the choice of a 
model: there is the possibility of gaining precision in the estimation, but at the risk of 
reducing the confidence level of the interval. It is important to note that for the same 
quantile and sample size, the lengths of confidence intervals grow with c, as shown by 
Figures 4 and 5. This is to be expected since Weibull distributions are bounded above 
while Gumbel and Frechet are not. Figure 6 shows the distance between the true values 
of Q.01 and Q.99 of the corresponding distribution, as the parameter c increases. 

6. Effect of sample size on interval length. As one would expect, the length of 
the intervals decreases as the sample size increases, but not uniformly. Halving the 
sample size from n = 50 to 25 increases interval length by a factor between 1.84 and 
2.65, depending on the value of c, and by a factor between 1.56 and 1.78 when decreasing from 
n = 100 to 50. Also, for a fixed sample size the length of intervals for Q.99 is always 
larger than that of intervals for Q.95, as shown in Figure 5. 

7. Coverage frequencies of GEV shape parameter c. The coverage frequencies of the 
profile likelihood intervals of this parameter, shown in Table 5, are stable throughout 
the range of values of c, with a slight decrease for the more negative values of c. The 
proportion of intervals that underestimate is much larger than that of intervals that overestimate 
the true value of c, especially in the Weibull cases. This asymmetry diminishes as c 
takes larger positive values. 

8. Asymmetry in the correct automatic selection of a model. The number of 
simulated samples where the estimator $\hat{c}$ has the same sign as the true value of c, 
as the column "correct" shows in Table 5, depends on the value of c. Although the 
difference is not pronounced, it is always more likely for the same value of |c| that the 
signs coincide in a Weibull case than in the corresponding Frechet case. On the other 
hand, it is more likely that intervals in the Frechet case cover the origin, and therefore 
make plausible a Gumbel model, as the "negative" column shows in Table 5. 

5 Rain Data Example 

In the state of Michoacan, Mexico, near its capital city Morelia, there is a meteorological 
monitoring station located at the Cointzio dam. This station is representative of rainfall 
patterns in this area. Yearly maxima of daily rainfall were obtained for 58 years in a period 
between 1940 and 2002. In this area, there is a marked rainy season from May to September. 
This data set will serve to illustrate the statistical modelling procedures suggested here. As 
a first step, the relative profile likelihood of the GEV shape parameter c shown in Figure 
7(a) assigns plausibility only to positive values of c, and the mle is $\hat{c} = 0.21$, therefore 
suggesting a Frechet model. Since rain data are necessarily non-negative, for physical reasons 
it is important to consider a Frechet model with a non-negative lower threshold parameter 
$\mu \ge 0$ that could very well simplify to a two parameter Frechet model, where $\mu = 0$. The 
relative profile likelihood of $\mu$ under the three parameter Frechet model, shown in Figure 
7(b), clearly assigns a very high plausibility to the value $\mu = 0$, so that the data appear to 
strongly support a two parameter Frechet model. Under this model, the maximum likelihood 
estimates are 

Figures 8(a) and 8(b) present together, for the sake of comparison, the corresponding 
relative profile likelihoods of the large quantiles of interest under the two parameter Frechet 
model and also under the GEV model without any restrictions on its parameters. The GEV 
model without restrictions on its threshold corresponds as well to a three parameter Frechet 
model without restriction on its threshold parameter; the corresponding Frechet mle's are 

In terms of the GEV distribution's parameters, the mle's are given by 

The likelihood intervals obtained for these quantiles with the GEV model are larger 
and imply that larger values of these quantiles are plausible. Also in these graphs, the 
ami GEV intervals are marked and show that their right endpoints tend to coincide with 
the right endpoints of the profile likelihood intervals of the two parameter Frechet model for these 
quantiles; nevertheless the left endpoints are much smaller than those of the likelihood intervals 
and therefore include small values of the quantiles that are implausible under both models 
(two parameter Frechet and the GEV). That is, the ami intervals tend to underestimate the 
values of the quantiles. 

The likelihood ratio statistic of these two models for this data set is 

$$W = \frac{L_{\text{Frechet}}(\mu = 0, \hat{\sigma}, \hat{\beta}; x)}{L_{\text{Frechet}}(\hat{\mu}, \hat{\sigma}, \hat{\beta}; x)} = 0.9983.$$

Since these models are nested, the observed value of -2 log W = 0.0034 has a p-value of 0.9535 
under the asymptotic chi-square distribution with one degree of freedom. The observed value 
W = 0.9983, so close to one, indicates that the two parameter Frechet model makes the 
observed sample practically as probable as the three parameter model does. Since the two 
parameter Frechet model is simpler and fits the data set adequately, as shown in Figure 9(a), 
this model should be preferred. Figure 9(a) shows the corresponding quantile-quantile plot 
with pointwise likelihood bands that include all observed values. Moreover, this model should 
be taken into account due to the physical considerations stated above. 
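
The likelihood ratio arithmetic can be checked with a few lines of Python. The sketch below uses only the standard library and relies on the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)); the function name is illustrative.

```python
import math


def lr_test_1df(w):
    """Likelihood ratio statistic -2 log W and its chi-square (1 df) p-value."""
    stat = -2.0 * math.log(w)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p_value = lr_test_1df(0.9983)
print(round(stat, 4), round(p_value, 4))
```

The p-value must be computed from -2 log W and not from W itself, since only the former has the asymptotic chi-square distribution.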

Likelihood-confidence intervals of 15% likelihood level and approximate 95% confidence 
level for the quantiles of interest Q.95 and Q.99 under the two parameter Frechet model are 
(61.6, 85.06) and (83.02, 131.66) respectively. Finally, Figure 9(b) shows the return period 
plot with profile likelihood 15% level bands marked for both the GEV model and the two 
parameter Frechet model. Since rainfall levels higher than 200ml are associated with flooding of 
Morelia, and since a return period of 100 years is associated with the quantile Q.99, the 
probability that the city of Morelia gets flooded within 100 years is extremely low. 

6 Conclusions 

Overall, profile likelihood intervals of large quantiles of Extreme Value distributions and 
of the GEV shape parameter c performed well and had adequate coverage frequencies for 
moderate and small sample sizes. In contrast, the corresponding ami intervals are symmetric 
about the mle and had lower and poor coverage frequencies in the case of samples of size 
$n \le 100$. Moreover, a large proportion of the ami intervals that excluded the true value 
tended to underestimate it. The ami intervals are frequently used in Extreme Value Theory 
applications without notice of these issues. 

Profile likelihood intervals of EV submodels tend to be shorter than the corresponding 
GEV profile likelihood intervals when the true value of c is close to zero, that is when 
$c \in (-0.05, 0.05)$, if the sample size is $n \le 50$. Nevertheless, their coverage frequencies are adequate 
so that they should be preferred when the model selection of an EV is clear. However, if there 
is no additional external information on a given preferred EV model suggested by the theory 
behind the specific phenomenon of interest, then using GEV profile likelihood intervals is 
a conservative procedure since they also had good coverage frequencies, even though these 
intervals tended to be larger. 

Profile likelihood intervals of c may serve as an aid in model selection. They also had 
adequate coverage frequencies. For values of c in a region around zero, (-0.01, 0.01), approximately 
95% of the likelihood intervals for the simulated samples included the value of zero. 
These are cases where the three EV models are plausible for the given sample, and also where 
the Gumbel model usually has a moderate or high plausibility given by the relative profile 
likelihood of c at zero. This is indicative of the need of additional external information of 
experts and other diagnostic methods to select adequately the best and most simple model 
for the phenomenon of interest. This will improve the estimating precision, and will prevent 
underestimating the quantile of interest. 

Finally, for sample sizes smaller than 50 and in the case that a Weibull model might be 
an appropriate choice, then the use of the exact likelihood function is suggested in order to 
make inferences about the parameters of interest through profile likelihood intervals. 

Frequencies, '<' is the number of intervals that fell below the true value, '>' the number that fell 
above and SNP represents the number of samples with numerical problems. 

15% Profile Likelihood Intervals for c with n=50 

Table 5. Coverage frequencies for c with sample sizes 100 and 50: '<' is the number of intervals 
that fell below the true value, '>' the number that fell above, 'Correct' stands for the number of 
samples with correct choice of EV and 'Negative' stands for the number of samples with negative 
product of interval endpoints. 

Figure 1: Coverage frequencies. The left column corresponds to Q.95, the right to Q.99. The 
first row corresponds to a sample size of 100, the middle row to sample size 50 and the 
bottom row to sample size 25. 

Figure 2: Ratio of length of likelihood-confidence intervals for Q.95 (top) and Q.99 (bottom) 
for the submodel over length of intervals for the GEV, sample size 100. 

Figure 3: Ratio of length of likelihood-confidence intervals for Q.95 (left) and Q.99 (right) for 
the submodel over length of intervals for the GEV, sample sizes 50 (top) and 25 (bottom). 

Figure 4: Length of profile likelihood-confidence intervals for Q.95 (top) and Q.99 (bottom) 
for the GEV, sample size 100. 

Figure 5: Length of profile likelihood-confidence intervals for Q.95 (left) and Q.99 (right), 
sample sizes n = 50 (top) and n = 25 (bottom) for the GEV. One outlying sample was 
excluded from plots (c) and (d). 

Figure 6: Difference between Q.01 and Q.99 for the GEV models with a = b = 1 and 
corresponding values of c. 

Figure 7: Rain data example: (a) Relative profile likelihood of GEV shape parameter c. (b) 
Relative profile likelihood of threshold parameter in three parameter Frechet model. 

Figure 8: Rain data example: Relative profile likelihood of (a) Q.95, (b) Q.99. 

Figure 9: Rain data example: (a) Q-Q plot for the two parameter Frechet model, (b) Return 
period plot. 